AITopics | proximal operator

Collaborating Authors

proximal operator

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Group-Aware Matrix Estimation and Latent Subspace Recovery

Golubovic, Hamza, Shen, Matthew, Allen, Genevera I., Zikry, Tarek M.

arXiv.org Machine LearningMay-21-2026

Modern matrix completion problems often involve heterogeneous data whose rows simultaneously belong to many meta-categories, such as demographic and age groups in recommendation systems, or region and recording session labels in neural electrophysiological experiments. Standard low-rank estimators impose a single global latent geometry, which can recover average structure but may smooth away subgroup-specific variation, especially when observations are unevenly distributed across groups. We introduce Group-Aware Matrix Estimation (GAME), a convex estimator for overlapping subgroup-wise low-rank matrix estimation. GAME regularizes category-specific submatrices through overlapping nuclear-norm penalties, allowing related groups to borrow information while preserving local latent structure in a shared coordinate system. We provide finite-sample guarantees for both reconstruction error and subgroup-specific subspace recovery, showing how performance depends on sampling density, subgroup rank, and overlap structure. Experiments on synthetic, recommendation, ecological, and neuroscience datasets show that GAME is most beneficial in structured missingness regimes, where subgroup-aware regularization improves both reconstruction accuracy and latent subspace fidelity. Across these benchmarks, GAME is competitive or best among global low-rank, side-information, and modern imputation baselines, with the largest gains when subgroups exhibit distinct low-rank structure.

artificial intelligence, machine learning, matrix completion, (14 more...)

arXiv.org Machine Learning

2605.20559

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Provable convergence guarantees for black-box variational inference

Neural Information Processing SystemsApr-29-2026, 20:36:01 GMT

Black-box variational inference is widely used in situations where there is no proof that its stochastic optimization succeeds. We suggest this is due to a theoretical gap in existing stochastic optimization proofs--namely the challenge of gradient estimators with unusual noise bounds, and a composite non-smooth objective. For dense Gaussian variational families, we observe that existing gradient estimators based on reparameterization satisfy a quadratic noise bound and give novel convergence guarantees for proximal and projected stochastic gradient descent using this bound. This provides rigorous guarantees that methods similar to those used in practice converge on realistic inference problems.

artificial intelligence, estimator, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report (0.46)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

AHighly-Efficient Group Elastic Net Algorithm with an Application to Function-On-Scalar Regression

Neural Information Processing SystemsApr-25-2026, 19:41:14 GMT

Feature Selection and Functional Data Analysis are two dynamic areas of research, with important applications in the analysis of large and complex data sets. Straddling these two areas, we propose a new highly efficient algorithm to perform Group Elastic Net with application to function-on-scalar feature selection, where a functional response is modeled against a very large number of potential scalar predictors. First, we introduce a new algorithm to solve Group Elastic Net in ultrahigh dimensional settings, which exploits the sparsity structure of the Augmented Lagrangian to greatly reduce computational burden. Next, taking advantage of the properties of Functional Principal Components, we extend our algorithm to the function-on-scalar regression framework. We use simulations to demonstrate the CPU time gains afforded by our approach compared to its best existing competitors, and present an application to data from a Genome Wide Association Study on childhood obesity.

artificial intelligence, group elastic, machine learning, (14 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report > Experimental Study (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Learning to Learn Graph Topologies

Neural Information Processing SystemsApr-25-2026, 02:26:31 GMT

Learning a graph topology to reveal the underlying relationship between data entities plays an important role in various machine learning and data analysis tasks. Under the assumption that structured data vary smoothly over a graph, the problem can be formulated as a regularised convex optimisation over a positive semidefinite cone and solved by iterative algorithms. Classic methods require an explicit convex function to reflect generic topological priors, e.g. the `1 penalty for enforcing sparsity, which limits the flexibility and expressiveness in learning rich topological structures. We propose to learn a mapping from node data to the graph structure based on the idea of learning to optimise (L2O). Specifically, our model first unrolls an iterative primal-dual splitting algorithm into a neural network. The key structural proximal projection is replaced with a variational autoencoder that refines the estimated graph with enhanced topological properties. The model is trained in an end-to-end fashion with pairs of node data and graph samples. Experiments on both synthetic and real-world data demonstrate that our model is more efficient than classic iterative algorithms in learning a graph with specific topological properties.

artificial intelligence, graph, machine learning, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent

Neural Information Processing SystemsApr-25-2026, 00:11:17 GMT

Addressing the interpretability problem of NMF on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we propose to relax BMF continuously using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that it converges quickly, recovers the ground truth precisely, and estimates the simulated rank exactly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.

artificial intelligence, machine learning, matrix, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.93)
North America > Canada (0.68)
North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.50)

Add feedback

Proximal Deep Structured Models

Shenlong Wang, Sanja Fidler, Raquel Urtasun

Neural Information Processing SystemsApr-22-2026, 13:45:45 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, proximal operator, (16 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Machine Learning-Assisted High-Dimensional Matrix Estimation

Tian, Wan, Yang, Hui, Lian, Zhouhui, Zhang, Lingyue, Peng, Yijie

arXiv.org Machine LearningMar-31-2026

Efficient estimation of high-dimensional matrices--including covariance and precision matrices--is a cornerstone of modern multivariate statistics. Most existing studies have focused primarily on the theoretical properties of the estimators (e.g., consistency and sparsity), while largely overlooking the computational challenges inherent in high-dimensional settings. Theoretically, we first prove the convergence of LADMM, and then establish the convergence, convergence rate, and monotonicity of its reparameterized counterpart; importantly, we show that the reparameterized LADMM enjoys a faster convergence rate. Notably, the proposed reparameterization theory and methodology are applicable to the estimation of both high-dimensional covariance and precision matrices. Keywords: ADMM; High-dimensional; Learning-based optimization; Matrix estimation. 1. Introduction High-dimensional matrix estimation--covering both covariance and precision matrix estimation--constitutes a cornerstone of modern statistics and data science [1, 2, 3]. Accurate covariance estimation enables the characterization of dependence structures among a large number of variables [4, 5, 6], which is indispensable in diverse domains such as genomics [7, 8], neuroscience [9], finance [10, 11, 12], and climate science [13, 14]. Over the past two decades, substantial progress has been made in the statistical theory of high-dimensional matrix estimation, particularly with respect to the accuracy of estimators, including properties such as sparsistency and consistency [5, 15, 16]. However, in empirical studies, the dimensionality is often only on the order of tens to hundreds, and in many cases is comparable to the sample size [21, 22, 23, 24]. This observation highlights a notable gap between the statistical theory of estimators and the practical challenges of their computational implementation.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Machine Learning

2603.28346

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Liaoning Province > Dalian (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.34)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

The Conjugate Domain Dichotomy: Exact Risk of M-Estimators under Infinite-Variance Noise in High Dimensions

Agiropoulos, Charalampos

arXiv.org Machine LearningMar-31-2026

This paper studies high-dimensional M-estimation in the proportional asymptotic regime (p/n -> gamma > 0) when the noise distribution has infinite variance. For noise with regularly-varying tails of index alpha in (1,2), we establish that the asymptotic behavior of a regularized M-estimator is governed by a single geometric property of the loss function: the boundedness of the domain of its Fenchel conjugate. When this conjugate domain is bounded -- as is the case for the Huber, absolute-value, and quantile loss functions -- the dual variable in the min-max formulation of the estimator is confined, the effective noise reduces to the finite first absolute moment of the noise distribution, and the estimator achieves bounded risk without recourse to external information. When the conjugate domain is unbounded -- as for the squared loss -- the dual variable scales with the noise, the effective noise involves the diverging second moment, and bounded risk can be achieved only through transfer regularization toward an external prior. For the squared-loss class specifically, we derive the exact asymptotic risk via the Convex Gaussian Minimax Theorem under a noise-adapted regularization scaling. The resulting risk converges to a universal floor that is independent of the regularizer, yielding a loss-risk trichotomy: squared-loss estimators without transfer diverge; Huber-loss estimators achieve bounded but non-vanishing risk; transfer-regularized estimators attain the floor.

artificial intelligence, machine learning, theorem 3, (18 more...)

arXiv.org Machine Learning

2603.28359

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Stochastic Three-Composite Convex Minimization

Alp Yurtsever, Bang Cong Vu, Volkan Cevher

Neural Information Processing SystemsMar-23-2026, 10:29:30 GMT

We propose a stochastic optimization method for the minimization of the sum of three convex functions, one of which has Lipschitz continuous gradient as well as restricted strong convexity. Our approach is most suitable in the setting where it is computationally advantageous to process smooth term in the decomposition with its stochastic gradient estimate and the other two functions separately with their proximal operators, such as doubly regularized empirical risk minimization problems. We prove the convergence characterization of the proposed algorithm in expectation under the standard assumptions for the stochastic gradient estimate of the smooth term. Our method operates in the primal space and can be considered as a stochastic extension of the three-operator splitting method. Numerical evidence supports the effectiveness of our method in real-world problems.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: